Introduction

Amazon Web Services (AWS) is a cloud computing platform that provides a range of on-demand services:

Because these services can be scaled on-demand, they allow organizations to ...

This document covers general concepts for commonly used AWS services. The document ...

⚠️NOTE️️️⚠️

The reason for this is that the underlying technology and prices will change as years go by, but the concepts are likely to remain relevant over time.

There are too many AWS services to cover everything in this document.

Geography

↩PREREQUISITES↩

AWS infrastructure is spread out across the world, divided into regions and availability zones:

Each region is said to have at least 2 AZs, where those AZs are connected together using high-speed network connections. The high-speed network connections between AZs serve two purposes:

  1. Abstraction: A region can be thought of as a single data center.
  2. Redundancy: An outage in one AZ shouldn't affect other AZs in the same region.

Kroki diagram output

⚠️NOTE️️️⚠️

Not all AWS services are available in every region.

Amazon Resource Name

↩PREREQUISITES↩

An Amazon Resource Name (ARN) is an identifier for a resource. ARNs are unique across all of AWS, not just a single account. Generally, an ARN should follow one of the following formats, ...

..., where ...

For example, arn:aws:iam::123456789012:user/jimmy and arn:aws:sns:us-east-1:123456789012:example-topic are valid ARNs.

For some services, {region} and {account-id} should be omitted as the service's resources are unique across the entirety of AWS. For example, there can only be one S3 bucket name my_bucket across all accounts and regions in AWS: arn:aws:s3:::my_bucket/Directory1.

Should the service not have unique resources across all of AWS but {region} and {account-id} are still omitted, the default region and the default account ID are used. For example, arn:aws:ec2:::instance/my-instance will target the default region and implicitly load in the account ID.

An ARN may include wildcards. For example, arn:aws:s3:::my_bucket targets an individual S3 bucket, but arn:aws:s3:::my_bucket/* targets all the contents of that bucket.
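
To make the format above concrete, here's a small Python sketch (just illustrative, not an AWS utility) that splits an ARN into its parts:

def parse_arn(arn):
    # Split into at most 6 parts: "arn", partition, service, region, account-id, resource.
    prefix, partition, service, region, account_id, resource = arn.split(":", 5)
    return {
        "partition": partition,
        "service": service,
        "region": region or None,         # Empty for global services (e.g. S3, IAM).
        "account_id": account_id or None,
        "resource": resource,             # May itself contain ':' or '/' separators.
    }

parse_arn("arn:aws:iam::123456789012:user/jimmy")
parse_arn("arn:aws:s3:::my_bucket/Directory1")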

Identity and Access Management

Identity and access management (IAM) is AWS's mechanism of managing authentication and authorization. It controls which entities (e.g. human users or software services) are authenticated and which parts of an AWS account they're authorized to access (e.g. permission to delete a database).

IAM breaks down access control into policies. An IAM policy defines whether a particular action can be performed on a particular resource (e.g. if you can create an EC2 instance). That IAM policy can then be applied to a ...

Kroki diagram output

IAM users / IAM groups / IAM roles are forked off from the root user, which is the user that gets created when the AWS account is created. Whereas the default behavior for a root user is to have unrestricted access to all services and resources under the AWS account, the default behavior for these forked off entities is to deny access. Policies can then be assigned to these entities to allow access to particular services and resources.

⚠️NOTE️️️⚠️

Standard practice is for the root user to not make changes directly, but to create various IAM users / IAM groups / IAM roles, each limited to only the permissions necessary to do the tasks needed.

IAM users may have multiple policies for a specific resource and action (e.g. an IAM user could be both in IAM group A and IAM group B, both of which have a policy for reading an S3 bucket). Should any of those multiple policies be set to deny, the denial takes precedence.

⚠️NOTE️️️⚠️

One piece of AWS that's deeply related to this area is AWS CloudTrail, which keeps an audit log of which users performed what access at what time.

Policies

↩PREREQUISITES↩

An IAM policy has 4 parts to it: Effect, action, resource, and condition.

  1. Effect: A flag indicating if access is to be allowed or denied.

  2. Action: A service and verb, describing the action to be taken on that service.

  3. Resource: An ARN to target.

    Some resources don't have a region associated with them (e.g. the S3 example above), meaning that the ARN region field is left blank. If the resource does require a ...

  4. Condition: A condition that must be met for the policy to apply.

    ⚠️NOTE️️️⚠️

    AWS has many condition keys. Too many to discuss here. Some are specific to a service (e.g. EC2 or S3), others are global. Check out here for a good starting point.

{
    "//": "Allow terminating __any__ EC2 instances for current account.",
    "//": " - ARN is missing account ID, meaning it defaults to current account ID.",
    "//": " - Condition requires MFA be enabled for policy to apply.",
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "my_statement_id_1234",
            "Effect": "Allow",
            "Action": [
                "ec2:StartInstances",
                "ec2:StopInstances",
                "ec2:TerminateInstances"
            ],
            "Resource": "arn:aws:ec2:us-east-1::instance/*",
            "Condition": {
                "BoolIfExists": {"aws:MultiFactorAuthPresent": true}
            }
        }
    ]
}

An IAM policy may also target ...

{
    "//": "Deny terminating EC2 instances __except__ for 'personal-compute'.",
    "//": " - ARN is missing account ID, meaning it defaults to current account ID.",
    "//": " - Condition requires MFA be enabled for policy to apply.",
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "my_statement_id_1234",
            "Effect": "Deny",
            "Action": "ec2:TerminateInstances",
            "NotResource": "arn:aws:ec2:us-east-1::instance/personal-compute",
            "Condition": {
                "BoolIfExists": {"aws:MultiFactorAuthPresent": true}
            }
        }
    ]
}

⚠️NOTE️️️⚠️

Recall that when there are multiple policies for the same resource and action (e.g. an IAM user could be both in IAM group A and IAM group B, both of which have a policy for reading an S3 bucket), denials always take precedence.

NotResource and NotAction are commonly used with Deny.

Policies come in two varieties:

⚠️NOTE️️️⚠️

The learning material advises against inline policies. Instead, it says you can create a managed policy and set a condition on it such that it only applies when attached to the identity of interest: "StringEquals": { "aws:username": "johndoe" }.
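
As a rough boto3 sketch (the policy name, user name, and bucket below are made up), creating such a managed policy and attaching it to a user might look like this:

import json
import boto3

iam = boto3.client("iam")

# Hypothetical managed policy that only applies when the attached identity is 'johndoe'.
policy_doc = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my_bucket/*",
        "Condition": {"StringEquals": {"aws:username": "johndoe"}}
    }]
}

created = iam.create_policy(
    PolicyName="johndoe-s3-read",
    PolicyDocument=json.dumps(policy_doc)
)
iam.attach_user_policy(
    UserName="johndoe",
    PolicyArn=created["Policy"]["Arn"]
)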

Boundaries

↩PREREQUISITES↩

An IAM permission boundary specifies the maximum permissions that an identity may have. For example, the IAM permission boundaries shown below dictate that the maximum permissions are any action performed on an S3 or EC2 resource.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:*",
                "ec2:*"
            ],
            "Resource": "*"
        }
    ]
}

The IAM permission boundaries above can then be assigned to an identity. If the identity were to be assigned permissions outside of those listed in the example above, access to those permissions would still be denied.
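
Attaching a boundary with boto3 might look roughly like this (the boundary policy ARN and identity names below are made up, and the boundary policy is assumed to already exist):

import boto3

iam = boto3.client("iam")

# Attach the boundary to a user. Even if the user is later granted broader permissions,
# anything outside s3:* / ec2:* is still denied.
iam.put_user_permissions_boundary(
    UserName="johndoe",
    PermissionsBoundary="arn:aws:iam::123456789012:policy/s3-ec2-boundary"
)

# The equivalent call for an IAM role (boundaries don't apply to IAM groups).
iam.put_role_permissions_boundary(
    RoleName="my-app-role",
    PermissionsBoundary="arn:aws:iam::123456789012:policy/s3-ec2-boundary"
)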

⚠️NOTE️️️⚠️

The learning material says that IAM permission boundaries don't apply to IAM groups, only IAM users and IAM roles.

Access Keys

Access keys are credentials that allow users (either the root user or an IAM user) to programmatically access AWS (e.g. via a Python script or AWS CLI). Each access key consists of two parts ...

  1. an access key ID, which is analogous to a username.
  2. a secret access key, which is analogous to a password.

Both the access key ID and the secret access key must be used together to authenticate against AWS.

Whereas a human typically uses a single username and password to access AWS, programmatic access typically has one access key per program. Should a program's access key need to be rotated or disabled, it can be done without interfering with access keys used by other programs.
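
A quick boto3 sketch of that lifecycle (the user name is made up): create an access key, use it to authenticate, and later disable just that key without touching any other program's key.

import boto3

iam = boto3.client("iam")

# Create a new access key for a (hypothetical) IAM user, e.g. one key per program.
resp = iam.create_access_key(UserName="report-generator")
key_id = resp["AccessKey"]["AccessKeyId"]
secret = resp["AccessKey"]["SecretAccessKey"]  # Only retrievable at creation time.

# A program can then authenticate with that key pair.
session = boto3.Session(aws_access_key_id=key_id, aws_secret_access_key=secret)
print(session.client("sts").get_caller_identity())

# Rotating / disabling this key doesn't affect keys used by other programs.
iam.update_access_key(UserName="report-generator", AccessKeyId=key_id, Status="Inactive")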

Virtual Private Cloud

↩PREREQUISITES↩

A Virtual Private Cloud (VPC) is an isolated network within AWS that resembles the type of network you'd encounter in a traditional data center (e.g. firewalls, access controls, NATs, peering connections, etc...). An account may have many VPCs (including multiple VPCs within the same region), but each VPC is tied to exactly one account and one region. For example, one account may have three VPCs, one in the eastern US region and two in the western US region.

Kroki diagram output

Each VPC has an IP address range associated with it, defined as a CIDR block. For ...

⚠️NOTE️️️⚠️

If possible, ensure that IP ranges between your VPCs / networks don't overlap. It's common for VPCs to connect with other VPCs and external networks (as if everything is on the same network), and you don't want to be in a situation where you have address conflicts.

Since a VPC is tied to a specific region, and each region has multiple AZs, each VPC has its IP address range further partitioned to uniquely target AZs within the region it's in (via a further restricted CIDR block). This further partitioned IP address space is referred to as a subnet.

Kroki diagram output

Regardless of the subnet CIDR block's prefix size, AWS reserves the first 4 addresses and the last address in the IP address range. For example, in a subnet of 10.0.1.0/24, addresses 10.0.1.0-10.0.1.3 and 10.0.1.255 are reserved by AWS.
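
For illustration, creating a VPC and carving per-AZ subnets out of it with boto3 might look like the sketch below (region / AZ names are just examples):

import boto3

ec2 = boto3.client("ec2")

# Hypothetical VPC spanning 10.0.0.0/16, with one subnet per AZ carved out of it.
vpc = ec2.create_vpc(CidrBlock="10.0.0.0/16")
vpc_id = vpc["Vpc"]["VpcId"]

subnet_a = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.1.0/24", AvailabilityZone="us-east-1a")
subnet_b = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.2.0/24", AvailabilityZone="us-east-1b")

# Of the 256 addresses in 10.0.1.0/24, AWS reserves 10.0.1.0-10.0.1.3 and 10.0.1.255,
# which is why the available address count is 5 lower than the CIDR block implies.
details = ec2.describe_subnets(SubnetIds=[subnet_a["Subnet"]["SubnetId"]])
print(details["Subnets"][0]["AvailableIpAddressCount"])  # 251, not 256.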

Implicit Router

Each subnet has an implicit router associated with it, located at the first address in the subnet's IP address range plus 1. For example, if a subnet's CIDR block is 10.0.1.0/24, its implicit router will be located at 10.0.1.1. This implicit router directs all of the subnet's traffic, whether that traffic is ...

The implicit router decides where traffic should be directed using a set of rules referred to as a route table. Each rule in the route table is referred to as a route.

Kroki diagram output

Each route is defined by two pieces of data:

  1. An IP range to route, specified as a CIDR block.
  2. A target within the VPC to route to, such as an Internet Gateway, NAT, or another VPC.

For example, the following route table specifies that the CIDR blocks ....

Destination IP Range     Target
10.0.0.0/16              Local
2001:db8:1234:1aff::/56  Local
0.0.0.0/0                igw-4aac
::/0                     eigw-f0a0
10.221.3.0/24            vpn-9b1c

⚠️NOTE️️️⚠️

By default, the implicit router will contain a "Local" route that spans the entire VPC (including all subnets).

A route table can be associated with many subnets, but each subnet can only be associated with one route table. Should a subnet not be associated with a route table, that subnet defaults to using its parent VPC's main route table.
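
Roughly, with boto3 (all IDs below are made up), creating a route table, adding a route, and associating it with a subnet might look like this:

import boto3

ec2 = boto3.client("ec2")

# A route table lives in a VPC and can be associated with many subnets.
rt = ec2.create_route_table(VpcId="vpc-1234567890abcdef0")
rt_id = rt["RouteTable"]["RouteTableId"]

# "Local" routes covering the VPC's CIDR blocks are added automatically. Extra routes,
# e.g. sending all remaining IPv4 traffic to an Internet gateway, are added explicitly.
ec2.create_route(RouteTableId=rt_id, DestinationCidrBlock="0.0.0.0/0",
                 GatewayId="igw-0abc1234def567890")

# Subnets not explicitly associated with a route table fall back to the VPC's main table.
ec2.associate_route_table(RouteTableId=rt_id, SubnetId="subnet-0abc1234def567890")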

Kroki diagram output

⚠️NOTE️️️⚠️

Some tips offered up by the learning material:

Kroki diagram output

Elastic Network Interface

An elastic network interface (ENI) is a virtual network card tied to a specific subnet. Attaching an ENI to an EC2 instance gives that EC2 instance access to that ENI's VPC. An ENI provides the EC2 instance with ...

A single EC2 instance can have many ENIs from different subnets attached, so long as those subnets are within the same AZ as the EC2 instance. However, ...

  1. there must always be a primary ENI attached to the EC2 instance.
  2. there is a maximum number of ENIs an EC2 instance can have attached (depends on the instance type).
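
A minimal boto3 sketch of creating a secondary ENI and attaching it to an instance (all IDs below are made up):

import boto3

ec2 = boto3.client("ec2")

# Create a secondary ENI in a subnet and attach it to an instance in the same AZ.
# DeviceIndex 0 is the primary ENI, which must always remain attached.
eni = ec2.create_network_interface(
    SubnetId="subnet-0abc1234def567890",
    Groups=["sg-0abc1234def567890"],
    Description="secondary interface"
)
ec2.attach_network_interface(
    NetworkInterfaceId=eni["NetworkInterface"]["NetworkInterfaceId"],
    InstanceId="i-0abc1234def567890",
    DeviceIndex=1
)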

Kroki diagram output

Security Group

↩PREREQUISITES↩

A security group is a firewall that defines what traffic is allowed in to and out of an EC2 instance. A rule for ...

Security groups control traffic to an EC2 instance by attaching to its ENIs. If an ENI is moved from one EC2 instance to another, its security groups move with it.
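
As a rough boto3 sketch (names, IDs, and CIDR blocks below are made up), creating a security group and allowing some inbound traffic might look like this:

import boto3

ec2 = boto3.client("ec2")

# Hypothetical security group allowing inbound SSH from one CIDR block and HTTPS from anywhere.
sg = ec2.create_security_group(
    GroupName="web-servers",
    Description="Allow SSH from the office and HTTPS from the Internet",
    VpcId="vpc-1234567890abcdef0"
)
ec2.authorize_security_group_ingress(
    GroupId=sg["GroupId"],
    IpPermissions=[
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "203.0.113.0/24"}]},
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ]
)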

⚠️NOTE️️️⚠️

As of writing this section, each ENI can have up to 5 security groups attached to it.

Kroki diagram output

Network Access Control List

↩PREREQUISITES↩

A network access control list (NACL) is a set of rules that defines what traffic is allowed in to and out of one or more subnets. Each rule is defined by a record comprised of the following items:

  1. Number, which defines the rule's evaluation priority in the list of rules.
  2. Type, which defines the type of traffic to look for (e.g. SSH).
  3. Protocol, which defines the type of packet to look for (e.g. TCP or UDP).
  4. Port range, which defines which ports to look for (e.g. 8080-8088).
  5. Source IP CIDR block, which defines which inbound IPs to look for.
  6. Destination IP CIDR block, which defines which outbound IPs to look for.
  7. Allow/deny flag, which defines if a matching packet should be denied or allowed.

Unlike security groups, NACLs are applied at the subnet level (instead of the ENI / EC2 instance level) and are stateless in that each incoming / outgoing packet is evaluated individually (they don't keep track of active connections). A NACL can be associated with multiple subnets, but each subnet can be associated with at most one NACL. If a subnet is not associated with a NACL, a default NACL is used.
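
For illustration, a boto3 sketch of creating a NACL and adding one inbound rule (the VPC ID and CIDR block below are made up):

import boto3

ec2 = boto3.client("ec2")

# Hypothetical NACL: rule 100 allows inbound SSH from one CIDR block. Unlike security
# groups, return traffic must be allowed explicitly because NACLs are stateless.
nacl = ec2.create_network_acl(VpcId="vpc-1234567890abcdef0")
ec2.create_network_acl_entry(
    NetworkAclId=nacl["NetworkAcl"]["NetworkAclId"],
    RuleNumber=100,
    Protocol="6",               # TCP
    RuleAction="allow",
    Egress=False,               # Inbound rule
    CidrBlock="203.0.113.0/24",
    PortRange={"From": 22, "To": 22}
)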

Kroki diagram output

Internet Gateway

↩PREREQUISITES↩

An Internet gateway (IGW) is a bridge allowing resources in a VPC (e.g. EC2 instances) to communicate with the Internet, provided those resources have a public IPv4 or IPv6 address. IGWs apply to all subnets of a VPC, so long as the implicit router of those subnets has a route for Internet traffic.

Since IPv6 addresses are globally unique, they are public by default. That means that an IPv6 address assigned to a resource in your subnet (e.g. EC2 instance) is a public IPv6 address by default. To support outbound-only communication with the Internet (similar to how a NAT works for IPv4), resources with IPv6 addresses may choose to use an egress-only internet gateway (EIGW) instead of an IGW.

If a subnet's implicit router ...

The default VPCs provided by AWS come with an IGW.

⚠️NOTE️️️⚠️

After creating an IGW / EIGW, don't forget to add it to the implicit router's route table.
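
That whole flow, creating the IGW, attaching it to the VPC, and adding the route, might look roughly like this with boto3 (IDs below are made up):

import boto3

ec2 = boto3.client("ec2")

igw = ec2.create_internet_gateway()
igw_id = igw["InternetGateway"]["InternetGatewayId"]
ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId="vpc-1234567890abcdef0")

# Route Internet-bound IPv4 traffic from a subnet's route table through the IGW.
ec2.create_route(RouteTableId="rtb-0abc1234def567890",
                 DestinationCidrBlock="0.0.0.0/0",
                 GatewayId=igw_id)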

Kroki diagram output

NAT Gateway

↩PREREQUISITES↩

A network address translation gateway (NAT gateway, or just NAT) allows resources in a subnet to send outgoing communication over the Internet while preventing unsolicited incoming communication from the Internet. For the NAT gateway to operate, it needs ...

  1. the owning subnet's VPC to have an IGW, otherwise the NAT gateway can't communicate with the Internet.
  2. an EIP assigned to it, otherwise it won't have a public IPv4 address to communicate with the Internet.

⚠️NOTE️️️⚠️

After creating a NAT gateway, don't forget to add it to the implicit router's route table.

NAT gateways are commonly used in scenarios where resources in a private subnet need outbound Internet access to download things like software patches.
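
Setting that up with boto3 might look roughly like the sketch below (subnet / route table IDs are made up; the NAT gateway sits in a public subnet and the private subnet routes through it):

import boto3

ec2 = boto3.client("ec2")

# Allocate an EIP for the NAT gateway and create it in a public subnet.
eip = ec2.allocate_address(Domain="vpc")
nat = ec2.create_nat_gateway(SubnetId="subnet-0abc1234def567890",
                             AllocationId=eip["AllocationId"])
nat_id = nat["NatGateway"]["NatGatewayId"]

# Wait until it's usable, then route the private subnet's Internet-bound traffic to it.
ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_id])
ec2.create_route(RouteTableId="rtb-0123456789abcdef0",
                 DestinationCidrBlock="0.0.0.0/0",
                 NatGatewayId=nat_id)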

Kroki diagram output

Elastic IP

↩PREREQUISITES↩

An elastic IP (EIP) is a static public IPv4 address. An EIP can be attached to an EC2 instance's ENI to give that EC2 instance a static public IP address. Otherwise, the public IPv4 address of an EC2 instance, if it has one, changes any time the instance is stopped and started.

Each EIP is associated with a region and an account. An account can change the ENI an EIP is attached to, but that ENI must be in the same region that the EIP is in.
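
A minimal boto3 sketch of allocating an EIP and associating it with an instance's ENI (the ENI ID below is made up):

import boto3

ec2 = boto3.client("ec2")

# Allocate an EIP and associate it with an instance's primary ENI.
eip = ec2.allocate_address(Domain="vpc")
ec2.associate_address(AllocationId=eip["AllocationId"],
                      NetworkInterfaceId="eni-0abc1234def567890")

# Release it later when it's no longer needed.
# ec2.release_address(AllocationId=eip["AllocationId"])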

A public IPv4 address, regardless of if it's temporary or an EIP, won't be directly visible to the EC2 instance it's attached to. For example, SSHing into that EC2 instance and running ifconfig -a will not list out that EC2 instance's public IPv4 addresses. That's because traffic between the Internet and that EC2 instance goes through a hidden NAT. Specifically, somewhere in the chain shown below is a NAT that re-maps packets such that source/destination IP addresses are appropriately translated between the EC2 instance's public and private IPv4 address.

Kroki diagram output

⚠️NOTE️️️⚠️

The docs say that IPv6 doesn't have this type of NAT applied. It can talk with the Internet without going through a NAT.

Peering

↩PREREQUISITES↩

While VPCs are isolated from each other, they may be peered together such that resources between them (e.g. EC2 instances) can communicate as if they're on the same network. This peering can span between accounts and / or regions.

Kroki diagram output

VPC peering is a two-step process:

  1. To peer VPCs, a VPC peering request must first be made. The initiator sends a request to the recipient, which the recipient can either accept or deny. The initiator and recipient may be the same account or completely different accounts. Either way, the recipient must explicitly accept the VPC peering request before the VPCs peer.

    Kroki diagram output

  2. Of the two VPCs being peered, each must add the other VPC's IP address range to the routing table of its subnets, where the target of the route is the VPC peering connection.

    Destination IP Range  Target
    10.221.9.0/24         Local
    10.221.3.0/24         pcx-9b1c

    ⚠️NOTE️️️⚠️

    To avoid headaches, keep your VPC IP ranges distinct from each other.
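
Both steps might look roughly like this with boto3 (VPC / route table IDs and the account ID below are made up; PeerOwnerId / PeerRegion are only needed when the peer VPC belongs to a different account / region):

import boto3

ec2 = boto3.client("ec2")

# Step 1: request peering, then the recipient accepts it.
req = ec2.create_vpc_peering_connection(
    VpcId="vpc-1111111111111111a",
    PeerVpcId="vpc-2222222222222222b",
    PeerOwnerId="123456789012"
)
pcx_id = req["VpcPeeringConnection"]["VpcPeeringConnectionId"]
ec2.accept_vpc_peering_connection(VpcPeeringConnectionId=pcx_id)  # Run as the recipient.

# Step 2: each side routes the other VPC's CIDR block through the peering connection.
ec2.create_route(RouteTableId="rtb-0abc1234def567890",
                 DestinationCidrBlock="10.221.3.0/24",
                 VpcPeeringConnectionId=pcx_id)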

⚠️NOTE️️️⚠️

The learning material claims that, ...

Elastic Compute Cloud

↩PREREQUISITES↩

Elastic Compute Cloud (EC2) is a service that rents out computing resources (e.g. physical machines, virtual machines, GPUs, FPGAs, etc..). Each rented out computing resource is referred to as an EC2 instance.

There are multiple types of EC2 instances available. EC2 instance types are categorized using the following designations:

These designations are typically combined as {instance-family}{instance-generation}{processor-family}{additional-capabilities}.{instance-size}. Not all designations need to be present in the string. For example, the EC2 instance type m6in.4xlarge is broken down as ...

.-General purpose instance family
|
| .-Intel processor
| |
| |  .-4x large instance size
| |  |
m6in.4xlarge
 | |
 | '-Network optimized / EBS optimized
 |
 '-6th instance generation
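
Launching an instance of that type with boto3 might look roughly like this (the AMI, subnet, and keypair below are made up):

import boto3

ec2 = boto3.client("ec2")

# Launch a single on-demand instance of a given type.
resp = ec2.run_instances(
    ImageId="ami-0abc1234def567890",
    InstanceType="m6in.4xlarge",
    MinCount=1,
    MaxCount=1,
    SubnetId="subnet-0abc1234def567890",
    KeyName="my-keypair"
)
print(resp["Instances"][0]["InstanceId"])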

⚠️NOTE️️️⚠️

The lists above for instance families, additional capabilities, etc.. are non-exhaustive. Two particular classes of compute resource that are important to enterprise but not listed above are ...

  1. dedicated instances: Your EC2 instances are guaranteed to run on physical machines exclusive to you, meaning no other AWS customer will have an EC2 instances on those physical machines. However, there may be multiple physical machines running your EC2 instances. For example, if you were to ask for 2 large and 6 small m6in instance types, one of the large instances could be on physical machine A while the other large and small instances could be on physical machine B. AWS decides which physical machine runs what and instances can hop between physical machines on restart, but all physical machines are guaranteed to only be running your instances (no other AWS customer will be running instances on those physical machines).

  2. dedicated hosts: Your EC2 instances are guaranteed to always run on the same physical machine, which is exclusive to you. That is, you rent the machine and decide how you want it split up into instances.

EC2 instances can be rented using a variety of pricing models:

Storage

EC2 instances can have two types of storage devices for file systems:

Each type of storage device has its own unique set of restrictions / properties. Regardless of which you pick, it's a one-to-one mapping between the storage device and EC2 instance: A single instance store or EBS volume is tied to a single EC2 instance, but a single EC2 instance can have multiple storage devices.

Kroki diagram output

Since these storage devices can't be tied to multiple EC2 instances at the same time, they can't be used for sharing data between EC2 instances. For data sharing between EC2 instances, there's Elastic File System (EFS), which is an NFS service provided by AWS.

Instance stores, EBS, and EFS are discussed in the subsections below.

Instance Store

An instance store is a storage device attached to the physical machine that the instance runs on. That storage device is ...

Due to the above restrictions, most customers choose to use EBS volumes over instance stores.

Elastic Block Store

↩PREREQUISITES↩

An Elastic Block Store (EBS) volume is a network attached storage device. Unlike instance stores, EBS volumes are ...

EBS volumes are backed by both ...

Of the above EBS volume types, only one type allows the user to explicitly target performance: Provisioned IOPS SSD. The other EBS volume types use elaborate algorithms where performance is gated based on volume size and / or performance is burstable based on activity. Regardless, workloads targeting high performance should target EBS-optimized instance types. Either the instance type comes with EBS optimization by default or it may be tacked on for an additional fee.

⚠️NOTE️️️⚠️

Actual numbers have been left out (e.g. min/max IOPS, min/max volume size, etc..) because I'm guessing these numbers will probably change in the future. To see how an IOPS is currently defined, see here.

⚠️NOTE️️️⚠️

To determine how effectively you're using EBS volumes, you can use CloudWatch. Specifically, ...

Higher numbers mean better performance: If it's ...

EBS volumes support ...

EBS volumes can only be attached to at most one EC2 instance at a time, and both the EBS volume and EC2 instance must be in the same AZ.
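
As a rough boto3 sketch (IDs below are made up), creating a Provisioned IOPS volume and attaching it to an instance in the same AZ might look like this:

import boto3

ec2 = boto3.client("ec2")

# Create the volume in the same AZ as the target instance, wait for it, then attach it.
# The device name may show up differently inside the guest OS depending on the AMI.
vol = ec2.create_volume(AvailabilityZone="us-east-1a", Size=100,
                        VolumeType="io2", Iops=4000)
ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])
ec2.attach_volume(VolumeId=vol["VolumeId"],
                  InstanceId="i-0abc1234def567890",
                  Device="/dev/sdf")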

Elastic File System

↩PREREQUISITES↩

Elastic File System (EFS) is a service that provides shared file storage for EC2 instances via the Network File System (NFS) protocol. EFS supports ...

⚠️NOTE️️️⚠️

NFS doesn't behave exactly the same as a local file system: Some Linux system calls (e.g. flock()) won't work on NFS.

There is no out-of-the-box NFS support for Windows.

EFS volumes are categorized by storage class:

EFS volumes are durable (data automatically replicated) and automatically scale storage capacity up/down as required. Data on EFS volumes is stored either ...

The latter, referred to as EFS One Zone, is provided at lower cost due to a greater risk of outages. While both protect against drive failures, EFS One Zone will fail completely should the AZ that it's operating in go down.

EFS volumes have two performance classes:

⚠️NOTE️️️⚠️

There are some restrictions here with EFS One Zone. The documentation right now recommends using General Purpose for everything.

I'm assuming these performance classes are only valid for the Standard storage class. It doesn't make sense otherwise. If you're using Infrequent Access or Archive, they're already severely gating how often you access data?

EFS volumes have three throughput classes:

⚠️NOTE️️️⚠️

Like with EBS volumes, EFS volumes using a throughput class other than Provisioned Throughput have a strange burst credit system, where credits build up over time and you can use them for short periods of faster throughput.

EFS requires network access to port 2049, meaning the security group used by the EC2 instances may need to be updated. Once network access is allowed, an EFS volume may be mounted manually using a helper utility called amazon-efs-utils.

sudo yum install -y amazon-efs-utils
sudo mkdir ./my-efs
sudo mount -t efs {efs-identifier}:/ ./my-efs

The amazon-efs-utils helper utility streamlines the use of features specific to EFS, such as automatically handling encryption during transit.

⚠️NOTE️️️⚠️

Is updating security groups a hard requirement or does AWS automatically handle this somehow? If so, are there other access control mechanisms in the VPC that need to be updated as well? Maybe NACL?

⚠️NOTE️️️⚠️

It's advised that you use amazon-efs-utils to deal with EFS, but you aren't forced to use it. You can also use the standard NFS utilities that come with your Linux distro:

# Amazon Linux / RHEL-based distros
sudo yum -y install nfs-utils
sudo mkdir ./my-efs
sudo mount -t nfs -o rsize=..,wsize=..,... {efs-dns-name}:/ ./my-efs

# Debian / Ubuntu-based distros
sudo apt-get -y install nfs-common
sudo mkdir ./my-efs
sudo mount -t nfs -o rsize=..,wsize=..,... {efs-dns-name}:/ ./my-efs

⚠️NOTE️️️⚠️

Auto-mounting an EFS volume may be done by updating /etc/fstab to include the line filesystem-id:/ mount-target efs defaults,_netdev 0 0. You can do this either manually or through the "cloud-init" script you use to initialize your EC2 instance on creation.

Amazon Machine Image

↩PREREQUISITES↩

An Amazon Machine Image (AMI) is a raw copy of an EC2 instance, used as a template for new EC2 instances. Each AMI typically contains an operating system (e.g. Ubuntu Linux or Windows), common software packages (e.g. grep, bash, etc..), and data / configurations. Most common AMIs are free to use, while others require a fee of some kind. For example, SAP provides various versions of their software packaged as AMIs, which they charge for.

Any AWS account can have its own set of AMIs, which may be private or publicly-available. AMIs come in two flavors:

For instance store-backed AMIs, if the newly created EC2 instance has a larger instance store than the EC2 instance used to create the AMI, some of the space may go unused. You'll likely need to manually expand the file system.
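
A minimal boto3 sketch of creating an AMI from an existing EBS-backed instance and launching from it (the instance ID and image name below are made up):

import boto3

ec2 = boto3.client("ec2")

# Create an AMI from a (hypothetical) EBS-backed instance, wait until it's available,
# then launch a new instance from it.
img = ec2.create_image(InstanceId="i-0abc1234def567890", Name="my-app-v1")
ec2.get_waiter("image_available").wait(ImageIds=[img["ImageId"]])
ec2.run_instances(ImageId=img["ImageId"], InstanceType="t3.micro", MinCount=1, MaxCount=1)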

⚠️NOTE️️️⚠️

An AMI is more than a disk / volume image. Each AMI may contain multiple volumes.

Access

Linux-based EC2 instances can be connected to through a variety of mechanisms. Traditionally, the connection mechanism of choice has been SSH pre-configured with a public/private keypair rather than password credentials. This mechanism requires at least some of your EC2 instances to be publicly exposed (public IP address and open ports). However, newer connection mechanisms don't require public exposure.

The subsections below describe the various connection mechanisms for Linux-based EC2 instances. Non-Linux EC2 instances aren't covered.

SSH

↩PREREQUISITES↩

SSH is the connection mechanism traditionally used to connect to an EC2 instance. In its simplest form, it requires that the EC2 instance be created with a public/private keypair, have a public IP address, and be assigned a security group that allows listening on port 22 (SSH port).

chmod 400 ./my-private-key.pem  # SSH/SCP requires these permissions on the private key
ssh -i ./my-private-key.pem ec2-user@ipaddress                                      # Connect to instance
scp -i ./my-private-key.pem ./my-dir/my-local-file.txt ec2-user@ipaddress:~/        # Upload to instance
scp -i ./my-private-key.pem ec2-user@ipaddress:~/my-remote-file.txt ./my-dir/       # Download from instance

A common pattern used to increase security is the use of bastion hosts: A bastion host is a short-lived temporary EC2 instance that's publicly accessible, responsible for acting as a bridge to EC2 instances that aren't publicly accessible. This way, EC2 instances running critical services don't need to be exposed to the public (no open ports on public IPs / no public IPs). For example, a database server may not have Internet access, but the database administrator can still access it by first SSHing into a bastion host and then SSHing from that bastion host to the database server.

Instance Connect

↩PREREQUISITES↩

EC2 Instance Connect Endpoint is a VPC component that, once added to a subnet in a VPC, allows for secure shell connections to any private EC2 instance in that VPC (no public IP or open ports required). The endpoint uses the account's access keys as its authentication and authorization mechanism, meaning the EC2 instance doesn't need to be assigned a keypair on creation. However, port 22 (SSH port) still needs to be open on the EC2 instance's private IP.

Once in place, connections can be made either through the ...

Session Manager

↩PREREQUISITES↩

Session Manager is a way to create secure shell connections to a private EC2 instance. It's part of a larger suite of services for customers that have many EC2 instances to manage and maintain (patching, configuration, automation, etc..), referred to as AWS Systems Manager. Unlike EC2 Instance Connect, session manager ...

⚠️NOTE️️️⚠️

Comments online claim port 22 doesn't need to be open because session manager uses reverse connections.

⚠️NOTE️️️⚠️

Session Manager claims that it can handle virtual machines that aren't EC2 instances but still managed by AWS (e.g. on-prem VMs that have been linked up to / are managed by AWS). Does EC2 Instance Connect also support this?

For session manager to work, it requires that the "SSM agent" be installed on the EC2 instance. Common Windows and Linux AMIs should already have this agent installed. If not, a standalone installer is available.

Once in place, connections can be made either through the ...

Scaling

↩PREREQUISITES↩

Scaling refers to the increasing or decreasing of computing resources to meet increasing or decreasing application load, respectively. The goal of scaling is to maintain equilibrium between preserving the application's ability to service requests (e.g. prevent crashes or slowdowns when requests come in faster than normal) and efficient utilization of computing capacity (e.g. prevent wasting money on EC2 instances that aren't needed).

EC2 instances are scalable both ...

The subsections below describe how EC2 supports each type of scaling.

Vertical Scaling

Vertical scaling changes an EC2 instance's resources and attributes (e.g. core count, RAM, networking, etc..) by changing its instance type. For the instance type to be changed, the EC2 instance first has to be stopped then started again. The newly selected instance type can have a different instance family, instance generation, and / or have different instance attributes applied (e.g. enhanced networking).
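
The stop / modify / start cycle might look roughly like this with boto3 (the instance ID and target type below are made up, and the instance is assumed to be EBS-backed):

import boto3

ec2 = boto3.client("ec2")
instance_id = "i-0abc1234def567890"

# Vertical scaling: stop the instance, change the instance type, start it again.
ec2.stop_instances(InstanceIds=[instance_id])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
ec2.modify_instance_attribute(InstanceId=instance_id, InstanceType={"Value": "m6i.2xlarge"})
ec2.start_instances(InstanceIds=[instance_id])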

⚠️NOTE️️️⚠️

I doubt hibernate can be used when changing instance type. If the instance is using EBS, at least data saved to disk should be retained.

Horizontal Scaling

Horizontal scaling, referred to as EC2 Autoscaling, changes the number of EC2 instances that are working together on the same goal (e.g. serving websites, serving data, computing something, etc..). EC2 Autoscaling groups together EC2 instances it controls into groups, referred to as autoscaling groups. An autoscaling group has its EC2 instances increased and decreased based on ...

EC2 Autoscaling has two ways of launching and configuring EC2 instances: launch configurations and launch templates. While launch configurations are simpler to configure, launch templates enable advanced features such as mixing-and-matching instance types as well as combining on-demand instances and spot instances.
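
For illustration, a boto3 sketch of creating an autoscaling group from a launch template and adding a target-tracking policy (names and subnet IDs below are made up, and the launch template is assumed to already exist):

import boto3

autoscaling = boto3.client("autoscaling")

# Autoscaling group built from a launch template, spread over two subnets.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",
    LaunchTemplate={"LaunchTemplateName": "web-template", "Version": "$Latest"},
    MinSize=2,
    MaxSize=10,
    DesiredCapacity=2,
    VPCZoneIdentifier="subnet-0abc1234def567890,subnet-0123456789abcdef0"
)

# Target-tracking policy that scales the group to keep average CPU around 50%.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",
    PolicyName="keep-cpu-at-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 50.0
    }
)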

⚠️NOTE️️️⚠️

EC2 Autoscaling is almost always used with Elastic Load Balancer, where the load balancer distributes out requests for the EC2 instances to process.

Placement Groups

The proximity of a set of EC2 instances can be influenced using placement groups. A placement group can either place those EC2 instances ...

There are 3 types of placement groups:

⚠️NOTE️️️⚠️

There's probably a lot more to this than what's here. The learning material mentioned that placement groups can't be merged and their names must be unique (within the account).

Instance Profile

↩PREREQUISITES↩

EC2 instances are able to automatically assume an IAM role, referred to as an EC2 instance profile. No access keys (or other credentials) are explicitly required.
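
For example, code like the following, run on an instance that has a profile attached and with no credentials configured anywhere, should succeed as the profile's IAM role (a small sketch, not a full demo):

import boto3

# No access keys configured: boto3 picks up the instance profile's temporary credentials.
print(boto3.client("sts").get_caller_identity()["Arn"])
print([b["Name"] for b in boto3.client("s3").list_buckets()["Buckets"]])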

⚠️NOTE️️️⚠️

I don't know this works under the hood, but it seems to work. The EC2 instance can run AWS CLI commands without having to do any type of setup. It might be AMI-specific or it could be some other mechanism they've created outside the VM to automatically authentic.

⚠️NOTE️️️⚠️

This might be useful for streamlining application access to S3? That is, your EC2 instance may need access to specific S3 buckets / objects, and this might easily enable that by giving the EC2 instance the correct IAM role?

Simple Storage Service

↩PREREQUISITES↩

Simple Storage Service (S3) is a service for storing and accessing data objects via an API. S3 objects are similar to files in that they're comprised of a name, bytes, and metadata (e.g. content-type, last modified, and potentially custom key-value pairs). However, unlike files, S3 objects aren't organized within a directory hierarchy. Instead, each S3 object belongs to a single container called an S3 bucket.

Kroki diagram output

S3 is categorized by class, where each class is suited for different data access and resiliency requirements:

Operations on S3 are strongly consistent, meaning modifications are immediately available to readers. Previously, S3 was only eventually consistent.
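
As a quick boto3 sketch (the bucket and key below are made up), writing an object and immediately reading it back might look like this; because S3 is now strongly consistent, the read reflects the write right away:

import boto3

s3 = boto3.client("s3")

s3.put_object(Bucket="my_bucket", Key="reports/2023.csv",
              Body=b"id,total\n1,9000\n", ContentType="text/csv")
obj = s3.get_object(Bucket="my_bucket", Key="reports/2023.csv")
print(obj["Body"].read())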

Each S3 bucket supports ...

Pricing for S3 is broken down by the amount of data stored, the number and types of operations invoked (e.g. read, write, delete, etc..), the amount of data transferred, and the options enabled. Pricing changes between classes, based on the use-cases the class is built for. For example, when compared against the standard class, the infrequent access class charges less for storage but more for data access.

Simple Notification Service

↩PREREQUISITES↩

Simple Notification Service (SNS) is a serverless message publication-subscription service, where publishers send messages for subscribers to receive. SNS is partitioned based on topics, which are essentially isolated communication channels within SNS: A publisher sends a message on a certain topic (e.g. a topic for whenever a customer makes a purchase, a topic for whenever a new user is created, etc..), and subscribers interested in that topic will receive it.

⚠️NOTE️️️⚠️

Each SNS topic gets its own ARN.

⚠️NOTE️️️⚠️

Unlike with SQS, messages aren't stored. A subscriber that isn't actively listening for messages won't receive those missed messages later on.

Each SNS message has a subject, body, and key-value attributes. SNS has a limit on how large a message can be (inclusive of its metadata). Messages exceeding this limit may store the actual message payload in S3 and reference it in the message body. S3's strong consistency guarantees that the payload will be available to the reader.

⚠️NOTE️️️⚠️

As of today, the SNS message limit is 256k.

Subscribers can filter SNS messages by setting filter policies that test attributes for certain conditions: value equality, suffix match, prefix match, numerical range testing, etc... The attribute conditions may be chained together using AND and OR. For example, consider SNS messages that come with a numeric attribute called price. An SNS subscription filter policy could filter out all messages whose price attribute is below 10 or above 100.
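
As a rough boto3 sketch (the topic name, queue ARN, and attribute values below are made up, and the queue is assumed to allow delivery from SNS), setting that kind of filter policy and publishing a matching message might look like this:

import json
import boto3

sns = boto3.client("sns")

topic_arn = sns.create_topic(Name="purchases")["TopicArn"]

# Subscription that only receives messages whose price attribute is between 10 and 100.
sns.subscribe(
    TopicArn=topic_arn,
    Protocol="sqs",
    Endpoint="arn:aws:sqs:us-east-1:123456789012:purchase-queue",
    Attributes={"FilterPolicy": json.dumps({"price": [{"numeric": [">=", 10, "<=", 100]}]})}
)

# Publishers attach the attribute the filter policy tests against.
sns.publish(
    TopicArn=topic_arn,
    Subject="purchase",
    Message=json.dumps({"customer": "jimmy", "price": 42}),
    MessageAttributes={"price": {"DataType": "Number", "StringValue": "42"}}
)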

SNS heavily integrates with other AWS services. Integrations come in two forms:

⚠️NOTE️️️⚠️

For a full list of services, see here and here.

⚠️NOTE️️️⚠️

If I recall correctly, it's possible to have your message have a custom payload for each "destination type" / "destination protocol". For example, the same message can be read for delivery by SMS and for delivery by email. The SMS and email could say different things?

⚠️NOTE️️️⚠️

One of the destination types can be a custom HTTP/HTTPS endpoint.

SNS messages may be encrypted at rest and are encrypted in-transit (via HTTPS communication).

Simple Queue Service

↩PREREQUISITES↩

Simple Queue Service (SQS) is a serverless message queuing service, where messages are placed into a queue for others to pull from. SQS supports two types of queues:

         Standard                           FIFO
Order    Best effort ordering               Guaranteed exact ordering
Delivery Guaranteed at-least-once           Guaranteed exactly-once
Scale    Unlimited transactions-per-second  Limited transactions-per-second

While standard queues have usage patterns similar to other distributed queues, FIFO queues behave slightly differently. FIFO queues guarantee exact ordering for messages within the same group. Each message must have a group ID associated with it. When the queue is pulled from, messages from multiple different group IDs may be returned, but those within the same group ID are guaranteed to be in-order.
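
A minimal boto3 sketch of sending to a FIFO queue (queue and group names below are made up; FIFO queue names must end in .fifo):

import boto3

sqs = boto3.client("sqs")
queue_url = sqs.create_queue(
    QueueName="orders.fifo",
    Attributes={"FifoQueue": "true"}
)["QueueUrl"]

# Messages sharing a MessageGroupId are delivered in order relative to each other;
# the deduplication ID lets SQS drop retried duplicates of the same send.
sqs.send_message(
    QueueUrl=queue_url,
    MessageBody='{"order": 1, "step": "charge-card"}',
    MessageGroupId="customer-jimmy",
    MessageDeduplicationId="order-1-step-1"
)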

⚠️NOTE️️️⚠️

As of today, the throughput limit on FIFO queues is 3k messages-per-second with batching or 0.3k messages-per-second without batching. This can be increased to 70k messages-per-second with batching.

Messages are pulled from an SQS queue using polling. SQS supports two types of polling:

Since SQS messages are distributed / replicated across server shards, short polling may miss some messages. But, those missed messages will likely be returned in subsequent polls as the shards accessed will likely be different.

Pulling messages is a two-step process: Once the message is received by a poller, it goes into an invisible state where other pollers can't receive it (the message isn't deleted). The message stays in that invisible state for a certain duration of time before being made visible again. The point is for the poller who originally received the message to process that message and then explicitly delete it from the queue, ensuring that the message is still available for processing should the poller encounter an error during its processing (e.g. crash). Should the same message be polled multiple times without being deleted, SQS will move it into a special queue for unprocessable messages called a dead-letter queue.
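
The receive / process / delete cycle might look roughly like this with boto3 (the queue URL below is made up):

import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"

# Long poll (up to 20s), process each message, then delete it. Messages that aren't
# deleted become visible again once their visibility timeout expires.
resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20)
for msg in resp.get("Messages", []):
    print(msg["Body"])  # ... process the message here ...
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])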

Kroki diagram output

⚠️NOTE️️️⚠️

You can set alarms for when a message enters the dead-letter queue (CloudWatch?).

SQS has a limit on how large a message can be. Messages exceeding this limit may store the actual message payload in S3 and reference it in the message body. S3's strong consistency guarantees that the payload will be available to the reader.

⚠️NOTE️️️⚠️

As of today, the SQS message limit is 256k.

SQS messages may be encrypted at rest and are encrypted in-transit (via HTTPS communication).

Other AWS services have built-in integrations with SQS: Lambda, SNS, etc.. For example, it's possible to link your SQS queue such that it populates from an SNS topic. In such integrations, the deleting of messages is automatically handled.

Lambda

↩PREREQUISITES↩

Lambda is a serverless service that runs code in response to an event. The code, referred to as a function, takes the event as input and processes it in some way (e.g. stores it in a database, sends out an email, etc..).

The function itself may be written in any modern language, but must be packaged as a zip file or a container image. If packaged as a ...

⚠️NOTE️️️⚠️

Lambda puts limits on how big the zip / container can be. If using a ...

If using a zip, one common pattern is to include dependencies within the zip via pip install -r requirements.txt -t ., which installs the packages into the current directory rather than the Python installation. Another common example is to use "Lambda layers", which are additional zip files that can be shared across various function zips and contain supplementary data like dependencies.

If you have dependencies or custom runtimes, using containers may be a better idea.

⚠️NOTE️️️⚠️

See here for a list of base images with language runtimes that have "runtime interface client" support already built-in.

See here for specs on how the "runtime interface client" should be implemented. It sounds like this is a client that queries a server? So the container starts and the first thing it does is pull a domain from the environment variables and queries a web service at that domain for the payload to run? They have pre-built clients up on GitHub for each language runtime.

Assuming a custom "runtime interface client" isn't used, a function typically has the entry point ...

def handler_name(event, context):
    ...  # Code goes here.

Each function is assigned a certain amount of memory for execution and is allowed to run for a short period of time. The number of cores used for execution depends on the amount of memory assigned: The more memory assigned, the more cores are available (linearly scales).

⚠️NOTE️️️⚠️

As of today, the limits are ...

Warm Starts

Once Lambda starts a process for a new function invocation, that process may remain active for some duration for subsequent function invocations. The act of launching an invocation from ...

The warm start mechanism means that state outside of the function entry point may not get reset on a new invocation. For example, in a warm start scenario, global variables will remain as-is from the previous invocation, allowing for those global variables to act as a cache for certain reusable resources (e.g. connections to other AWS services).

import boto3

s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')

def handler_name(event, context):
    ...  # Code goes here.

⚠️NOTE️️️⚠️

It may be beneficial to have a single function handle multiple different operations (you can change what the function does depending on the event payload). It helps keep things warm.

Configuration

↩PREREQUISITES↩

A function is configurable using environment variables. Environment variables are typically set when the function is deployed onto Lambda. For example, an environment variable might provide the URL of the database server the function communicates with. Environment variables may be encrypted at-rest via KMS.

Functions may also be configured to use a shared EFS, such that data and/or configurations may be shared across invocations.

Invocation Modes

Depending on what triggered the function, it may be invoked synchronously or asynchronously. Invocations that are ...

As an example, invocations by API Gateway are synchronous (e.g. it expects a response to send back to the HTTP call) while invocations by CloudWatch alarms are asynchronous. Because synchronous invocations block, the invoker typically gates the runtime of the function to a much shorter duration than the maximum allowed by Lambda.
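
From a caller's perspective, the two invocation types look roughly like the boto3 sketch below (the function name and payload are made up):

import json
import boto3

lambda_client = boto3.client("lambda")

# Synchronous invocation: blocks until the function returns and hands back its response.
resp = lambda_client.invoke(
    FunctionName="my-function",
    InvocationType="RequestResponse",
    Payload=json.dumps({"action": "ping"})
)
print(json.load(resp["Payload"]))

# Asynchronous invocation: Lambda queues the event and returns immediately.
lambda_client.invoke(
    FunctionName="my-function",
    InvocationType="Event",
    Payload=json.dumps({"action": "ping"})
)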

Versioning

Lambda supports the versioning of functions. For example, Lambda makes it possible to deploy a version of a function for beta testing without affecting users who rely on the stable version.

Function versions are often deployed in front of an alias, which acts as a pointer to a function version. The pointer can be updated, such that callers invoking the alias are automatically moved forward to the latest version (or rolled back to a previous version).
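
A small boto3 sketch of publishing a version and managing an alias (function / alias names and version numbers below are made up):

import boto3

lambda_client = boto3.client("lambda")

# Publish the current code/configuration as an immutable version, then point an alias
# at it. Callers invoke "my-function:prod" and are moved forward by updating the alias.
version = lambda_client.publish_version(FunctionName="my-function")["Version"]
lambda_client.create_alias(FunctionName="my-function", Name="prod", FunctionVersion=version)

# Later, roll the alias forward (or back) without callers changing anything.
lambda_client.update_alias(FunctionName="my-function", Name="prod", FunctionVersion="7")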

API Gateway

↩PREREQUISITES↩

API Gateway is a service that binds web API endpoints to various AWS computing services for processing (e.g. Lambda functions, EC2 instances, ECS, etc..) as well as streamlines web API management.

⚠️NOTE️️️⚠️

Supported web API protocols include REST, HTTP, and WebSocket. This section primarily deals with REST / HTTP.

Each combination of API path (resources) and HTTP method can be bound to a different computing service, typically referred to as an integration. For example, an HTTP GET on the path /my_resource may direct the request to integrate with a Lambda function for processing, while an HTTP POST on the same path may direct the request to integrate with ECS.

API Gateway streamlines several common aspects of API management:

Staged Deployments

API Gateway allows APIs to be deployed in stages, where the developer defines what those stages are (e.g. dev vs prod, different versions of the same API, etc..). Stages allow for concurrent deployment of the same API. Each stage is given a unique HTTP endpoint, similar in form to https://{api-id}.execute-api.{region}.amazonaws.com/{stage}.

API Gateway also allows canary deployments: A certain percentage of traffic gets sent to a canary stage, where canary is referring to an unproven version of the API. Assuming that error rates don't increase with the canary, the canary moves into place as the current deployment.

Kroki diagram output

Each stage may optionally have rate limiting and / or throttling applied.

Security

↩PREREQUISITES↩

To authorize access to an API, API Gateway has built-in support for ...

To limit access to an API, API Gateway allows setting policies on individual resources (paths). These policies can limit access based on CIDR blocks, IAM roles, and / or VPCs / VPC endpoints. IAM policies can also be used to define which IAM entities (IAM roles, IAM groups, etc..) have access to the API.

To secure an API, Web Application Firewall (WAF) may be placed in front of API Gateway to protect it from common exploits and abuse (e.g. SQL injection).

Endpoint Types

An API can be one of several endpoint types:

Each API endpoint type has different quirks associated with it. For example, an edge-optimized API endpoint may benignly transform the HTTP headers prior to passing it downstream for further processing.

Batch

↩PREREQUISITES↩

Batch is a service for running the same computation over each item in a set of data, referred to as a batch. Computations are packaged as container images, where Batch schedules and executes those container images over the set of data. The compute resources used to execute the container images are typically provisioned and managed by Batch, but may also be managed directly by the user (e.g. you provide your own EC2 or ECS cluster).

Batch organizes workloads into the following components:

Kroki diagram output

Batch integrates with CloudWatch to support monitoring and logging.
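
Submitting a workload might look roughly like the boto3 sketch below (job, queue, and job definition names are made up; this uses an array job, where the same container runs once per item):

import boto3

batch = boto3.client("batch")

# Array job: the container runs 100 times, with the AWS_BATCH_JOB_ARRAY_INDEX environment
# variable telling each run which item of the batch to process.
batch.submit_job(
    jobName="resize-images",
    jobQueue="default-queue",
    jobDefinition="resize-images:3",
    arrayProperties={"size": 100}
)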

Cognito

↩PREREQUISITES↩

Cognito is a service for authentication and authorization that can either integrate with third-party identity providers (e.g. Google, Facebook, Apple, Active Directory, custom enterprise identity providers, etc..) or act as its own identity provider (e.g. Cognito directly stores and manages identities).

Cognito handles authentication and authorization using the concepts of user pools and identity pools:

⚠️NOTE️️️⚠️

The documentation for identity pools explicitly mentions "federated" identities, probably meaning that it's used for linking outside identities to IAM roles.

Both pool types are covered in the subsections below.

User Pool

↩PREREQUISITES↩

Cognito integrates with an application / client via a user pool: A user pool stores and maintains a user directory and provides control over the authentication and authorization flow. For example, a user pool can control ...

User pools support placing users into groups, where each group may have a certain set of privileges (e.g. admin group vs normal group). A user can belong to multiple groups. Each group may be assigned ...

In the background, each user pool acts as an OpenID Connect (OIDC) identity provider (IdP). Once authenticated, the client receives JSON Web Tokens (JWT) for ...

A client proves it has the necessary access rights by passing along JWTs in its requests. To authenticate, a client can use the hosted authentication page that Cognito provides, or go through other AWS services that either integrate directly with Cognito or provide integration functionality. For example, AWS provides a suite of development tools called Amplify, which provides React components for authentication with Cognito.
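
Authenticating directly against a user pool might look roughly like the boto3 sketch below (the app client ID and credentials are made up, and the app client is assumed to have the USER_PASSWORD_AUTH flow enabled with no client secret):

import boto3

cognito = boto3.client("cognito-idp")

# On success Cognito returns the JWTs (ID, access, and refresh tokens).
resp = cognito.initiate_auth(
    ClientId="my-app-client-id",
    AuthFlow="USER_PASSWORD_AUTH",
    AuthParameters={"USERNAME": "johndoe", "PASSWORD": "correct horse battery staple"}
)
tokens = resp["AuthenticationResult"]
print(tokens["IdToken"], tokens["AccessToken"], tokens["RefreshToken"])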

Identity Pool

An identity pool allows authenticated and unauthenticated users to assume an IAM role. Those users are given temporary credentials which they can then use to interface with the AWS API directly as the IAM role.

⚠️NOTE️️️⚠️

The documentation for identity pools explicitly mentions "federated" identities, probably meaning that it's used for linking outside identities.

⚠️NOTE️️️⚠️

Learning material says there are two modes of operation here, but Google isn't showing anything.

Different IAM roles can be assigned for authenticated vs unauthenticated users.
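
Exchanging a user pool JWT for temporary credentials might look roughly like the boto3 sketch below (the identity pool ID, user pool domain, and token are made up; omitting Logins would instead yield the unauthenticated role's credentials, if that's enabled):

import boto3

identity = boto3.client("cognito-identity")

logins = {"cognito-idp.us-east-1.amazonaws.com/us-east-1_EXAMPLE": "<id-token-jwt>"}
identity_id = identity.get_id(IdentityPoolId="us-east-1:11111111-2222-3333-4444-555555555555",
                              Logins=logins)["IdentityId"]
creds = identity.get_credentials_for_identity(IdentityId=identity_id,
                                              Logins=logins)["Credentials"]

# Use the temporary credentials to call the AWS API as the pool's authenticated IAM role.
session = boto3.Session(aws_access_key_id=creds["AccessKeyId"],
                        aws_secret_access_key=creds["SecretKey"],
                        aws_session_token=creds["SessionToken"])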

Terminology